{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# COMPSCI 389: Introduction to Machine Learning\n", "# Topic 10.1 Automatic Differentiation for Functions\n", "\n", "In this notebook we present basic methods for automatic differentiation in Python." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Autograd\n", "\n", "Autograd is a Python library that automatically differentiates native Python and NumPy code. It can handle a large variety of mathematical operations. The key feature of autograd is its ability to differentiate functions that are defined by Python code, using the actual code to compute the derivatives, which makes it applicable to a wide range of problems in ML. Autograd implements the reverse mode automatic differentiation approach describe in the lecture slides.\n", "\n", "The main function in autograd is `grad`, which computes the gradient (derivatives with respect to each input) of a scalar-valued function.\n", "\n", "First, install autograd\n", "\n", "> pip install autograd\n", "\n", "Next, let's import `grad` from `autograd`. We will also import `autograd.numpy`, which is a wrapped version of the standard `numpy` library, designed to work with autograd. That is, it contains all of the functions of `numpy`, but modified to allow for the calculation of derivatives." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "import autograd.numpy as np # Use the wrapped version of numpy that includes derivative computations\n", "from autograd import grad # We will primarily use the grad function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Defining a Function to Differentiate\n", "\n", "Next, let's define the same function from the lecture notes:\n", "\n", "$$\n", "f(x)=3x^2 + 2x.\n", "$$\n", "\n", "In Python:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "def f(x):\n", " return 3 * (x**2) + (2 * x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using Autograd to Compute the Derivative\n", "\n", "Next, let's compute the derivative of $f(x)$ at $x=5$ using the `grad` function. We should get the same result as the analytic solution:\n", "$$\n", "\\begin{align}\n", "f(x)=& 3x^2 + 2x\\\\\n", "\\frac{df(x)}{dx}=& \\frac{d}{dx}(3x^2 + 2x)\\\\\n", "=&6x + 2\\\\\n", "\\frac{df(x)}{dx}{\\huge |}_{x=5} =& 6(5)+2\\\\\n", "=&32.\n", "\\end{align}\n", "$$" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'The derivative is: 32.0.'" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "f_prime = grad(f) # Often \"prime\" is used to denote the derivative. f_prime here is the derivative of f.\n", "display(f\"The derivative is: {f_prime(5.0)}.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Warning\n", "\n", "**Note**: The following similar code results in an error. The issue is that differentiation requires floating-point inputs, so passing an integer to `f_prime` results in an error. Notice that above we provided `5.0` as the argument, not `5`.\n", "\n", "**Note**: The code below causes an error, which stops the whole notebook from running when hitting \"Run all\". So, the line below is commented out. 
Uncomment it to confirm that it produces an error." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "# display(f\"The derivative is: {f_prime(5)}.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions with Multiple Inputs\n", "\n", "When the function being differentiated has more than one input, you can provide a second argument to `grad` that specifies which input the derivative should be taken with respect to. If this second value is not provided, it defaults to zero (indicating that the derivative should be taken with respect to the first input). To see how this works, let's differentiate:\n", "\n", "$$\n", "f(x,y)=3x^2 + 2y - 7,\n", "$$\n", "\n", "at $x=3$ and $y=5$. We can work this out analytically. The derivative with respect to $x$ is $6x$, which results in $18$ when evaluated at $x=3$. The derivative with respect to $y$ is $2$." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'The partial derivative w.r.t. x is: 18.0.'" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "'The partial derivative w.r.t. y is: 2.0.'" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def f(x, y):\n", " return 3 * x**2 + 2 * y - 7\n", "\n", "partial_x = grad(f, 0) # Partial derivative with respect to x. This is equivalent to grad(f).\n", "partial_y = grad(f, 1) # Partial derivative with respect to y\n", "\n", "display(f\"The partial derivative w.r.t. x is: {partial_x(3.0, 5.0)}.\")\n", "display(f\"The partial derivative w.r.t. y is: {partial_y(3.0, 5.0)}.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When using ML models, the inputs that we differentiate w.r.t. will be the model parameters, which are stored in a numpy array. 
We can reproduce the code above using numpy arrays and a single call to `grad`:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'The gradient at [3. 5.] is [18. 2.]'" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# The same function, but taking a numpy array as input\n", "def f(inputs):\n", " x, y = inputs\n", " return 3 * x**2 + 2 * y - 7\n", "\n", "# Now, the gradient function returns the gradient with respect to the entire numpy array of inputs\n", "grad_f = grad(f)\n", "\n", "point = np.array([3.0, 5.0]) # The point at which we want the derivatives (avoid shadowing the built-in name 'input')\n", "gradient = grad_f(point) # Get the derivatives (the gradient)\n", "display(f\"The gradient at {point} is {gradient}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 2 }